The fundamental shift in the Bayesian paradigm lies in the ontological status of the unknown parameter $\theta$. Unlike frequentist statistics, which treats $\theta$ as a fixed but unknown constant, the Bayesian approach treats $\theta$ as a random variable. This allows us to quantify uncertainty through a prior probability measure $\Pi$.
The Bayesian Model Construction
A complete Bayesian model is defined by the pair $(\{f_{\theta} : \theta \in \Omega\}, \Pi)$. Bayesian inference is not merely "using Bayes' Theorem," but the deliberate act of adding a prior probability distribution to the sampling model as an essential ingredient for inference.
The total state of our knowledge is captured by the joint distribution $\pi(\theta) f_{\theta}(s)$. This function links the observed data $s$ and the unobserved parameter $\theta$ in a single coherent probabilistic framework. Conditioning on the observed $s$ then yields the posterior, $\pi(\theta \mid s) \propto \pi(\theta) f_{\theta}(s)$, which is the joint distribution renormalized over $\theta$.
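The construction above can be sketched numerically with a grid approximation: form the joint $\pi(\theta) f_{\theta}(s)$ pointwise, then normalize over $\theta$. The specific model here is an illustrative assumption, not from the text: $\theta$ is a coin's heads probability with a Beta(2, 2) prior, and $s$ is 7 heads in 10 flips.

```python
# Grid approximation of the posterior pi(theta | s) ∝ pi(theta) * f_theta(s).
# Assumed for illustration only: Beta(2, 2) prior on a coin bias theta,
# binomial sampling model, data s = 7 heads out of n = 10 flips.
from math import comb

def prior(theta):
    # Beta(2, 2) density, up to its normalizing constant
    return theta * (1 - theta)

def likelihood(theta, k=7, n=10):
    # Binomial sampling model f_theta(s) for k heads in n flips
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

grid = [i / 1000 for i in range(1, 1000)]
joint = [prior(t) * likelihood(t) for t in grid]   # pi(theta) * f_theta(s)
norm = sum(joint)
posterior = [j / norm for j in joint]              # renormalize over theta

post_mean = sum(t * p for t, p in zip(grid, posterior))
print(f"posterior mean of theta ~ {post_mean:.3f}")
```

With these assumptions the posterior is a Beta(9, 5) density, so the grid mean lands near $9/14 \approx 0.643$; the point is only that the joint, once normalized, is the complete inferential object.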
Direct Probability Statements
In this paradigm, $\theta$ is governed by a probability density $\pi(\theta)$. This allows us to make direct probability statements about the parameter, such as $P(\theta \in A)$. No such statement is available in a frequentist framework: there $\theta$ is a fixed constant, so $P(\theta \in A)$ is trivially 0 or 1 rather than a meaningful quantification of uncertainty.
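A statement like $P(\theta \in A)$ is just an integral of the density over $A$. As a minimal sketch, assuming (illustratively, not from the text) that the distribution of $\theta$ is Beta(9, 5) and $A = \{\theta > 0.5\}$:

```python
# P(theta in A) as an integral of a density over A.
# Assumed for illustration only: theta ~ Beta(9, 5), A = (0.5, 1).
from math import gamma

def beta_pdf(t, a=9, b=5):
    # Beta(a, b) density via the gamma function
    return gamma(a + b) / (gamma(a) * gamma(b)) * t**(a - 1) * (1 - t)**(b - 1)

# Trapezoid-rule integral of the density over A = (0.5, 1).
n = 10_000
xs = [0.5 + 0.5 * i / n for i in range(n + 1)]
ys = [beta_pdf(x) for x in xs]
p_A = (0.5 / n) * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
print(f"P(theta > 0.5) ~ {p_A:.3f}")
```

The output is an ordinary probability about the parameter itself, which is exactly the kind of statement the frequentist framework cannot make.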
Real-World Analogy: Medical Diagnostics
In diagnostics for a rare disease, the unknown is whether a patient has the disease. In the Bayesian paradigm, we treat the disease status $\theta$ as a random variable. If the prevalence is 0.1% (the prior) and a test (the model $f_{\theta}$) returns positive, we do not just look at the test's accuracy; we weigh the joint probability of having the disease and testing positive against the total probability of a positive result to obtain the updated probability of illness.
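This update is a direct application of Bayes' theorem. The 0.1% prevalence is from the example above; the test's sensitivity and specificity are illustrative assumptions, since the text does not specify them.

```python
# Bayes' theorem for the diagnostic example.
# Prevalence 0.1% comes from the text; sensitivity 99% and
# specificity 95% are assumed for illustration only.
prevalence = 0.001           # prior P(disease)
sensitivity = 0.99           # assumed P(positive | disease)
specificity = 0.95           # assumed P(negative | no disease)

# Joint probability of each disease status AND a positive test.
joint_pos_disease = prevalence * sensitivity
joint_pos_healthy = (1 - prevalence) * (1 - specificity)

# Posterior: the joint, normalized by the total probability of a positive.
posterior = joint_pos_disease / (joint_pos_disease + joint_pos_healthy)
print(f"P(disease | positive) ~ {posterior:.3f}")
```

Under these assumed error rates the posterior is only about 2%, far below the test's nominal accuracy, because the rare-disease prior dominates the update.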